Interpolation of Austrian German and Viennese Dialect/Sociolect in HMM-based Speech Synthesis
Abstract
In contrast to widely used waveform concatenation methods, the presented approach to speech synthesis relies on a parametric analysis–re-synthesis technique, in which the features extracted in the analysis stage are modeled by hidden Markov models (HMMs). Many important improvements over the last decade have helped this approach reach impressive performance. Its inherent flexibility also makes it suitable for advanced speech synthesis tasks such as speaker adaptation, speaker interpolation, and emotional speech. In this work, a flexible multi-dialect HMM-based speech synthesis system for Austrian German and Viennese dialect/sociolect is presented. A novel contribution is the interpolation of dialects, which requires dealing with phonological processes that change the segmental structure of the utterance. Evaluation results show that listeners perceive both continuous and categorical changes between varieties.
Similar resources
Modeling and interpolation of Austrian German and Viennese dialect in HMM-based speech synthesis
An HMM-based speech synthesis framework is applied to both Standard Austrian German and a Viennese dialectal variety, and several training strategies for multi-dialect modeling, such as dialect clustering and dialect-adaptive training, are investigated. For bridging the gap between processing on the level of HMMs and on the linguistic level, we add phonological transformations to the HMM interpola...
Modeling Austrian dialect varieties for TTS
In this paper we discuss certain strategies for building adapted TTS systems for dialectal or regional varieties from a given standard source. The basic question is how much recoding is necessary for a given transfer and to what extent it is possible to rely on the speech data alone. It will turn out that there are ambiguities that cannot be resolved without a certain amount of linguistic engin...
Optimizing Phonetic Encoding for Viennese Unit Selection Speech Synthesis
While developing lexical resources for a particular language variety (Viennese), we experimented with a set of 5 different phonetic encodings, termed phone sets, used for unit selection speech synthesis. We started with a very rich phone set based on phonological considerations and covering as much phonetic variability as possible, which was then reduced to smaller sets by applying transformati...
Reducing Segmental Duration Variation by Local Speech Rate Normalization of Large Spoken Language Resources
We developed a time-domain normalization procedure which uses a speech signal and its corresponding speech rate contour as an input, and produces the normalized speech signal. Then we normalized the speech rate of a large spoken language resource of German read speech. We compared the resulting segment durations with the original durations using several three-way ANOVAs with phone type and spea...
Comparison of dialect models and phone mappings in HSMM-based visual dialect speech synthesis
In this paper we evaluate two different methods for the visual synthesis of Austrian German dialects with parametric Hidden Semi-Markov Model (HSMM) based speech synthesis. One method uses visual dialect data, i.e. visual dialect recordings that are annotated with dialect phonetic labels; the other method uses a standard visual model and maps dialect phones to standard phones. This second metho...
Publication date: 2009